Indian hospitals handle large daily volumes of patient telephone calls covering appointment requests, doctor availability inquiries, laboratory report status, and emergency assistance. Existing automated systems, namely rigid Interactive Voice Response (IVR) menus and English-centric chatbots, fail to serve the linguistically diverse patient populations of Indian cities, where Hindi, Marathi, and English are often mixed within a single conversation. This paper presents the design, implementation, and experimental evaluation of an AI-powered Multilingual Healthcare Calling Support System that addresses this gap through eight integrated technical modules. A custom Natural Language Processing (NLP) engine classifies patient intent across seven categories with 96.4% accuracy on a 28-query multilingual test set. A three-stage multilingual detection engine achieves 100% language identification accuracy across Hindi, Marathi, English, and code-mixed speech patterns. A Retrieval-Augmented Generation (RAG) pipeline combining ChromaDB vector storage with Llama 3.1 inference through the Groq API produces factually grounded, hospital-specific conversational responses with an average end-to-end latency of 1.4 seconds. A five-category emotion detection module supports patient safety through multilingual emergency keyword detection and immediate escalation to the 108 emergency service. Real telephony is delivered through Twilio Voice API integration and was verified in three live telephone call tests that demonstrated complete appointment booking and emergency escalation workflows. The system is built entirely from open-source and affordable cloud-based components, demonstrating practical viability for deployment in Indian hospitals of all sizes.
Introduction
Hospitals and clinics in India receive a large number of patient calls daily for appointments, doctor availability, lab reports, and emergency assistance. Handling these calls manually places significant pressure on front-desk staff, which degrades service quality and raises call abandonment rates. In multilingual regions such as Maharashtra, callers often mix Hindi, Marathi, and English, and traditional automated systems built on monolingual language processing handle such speech poorly.
To address these challenges, this paper proposes an AI-Powered Multilingual Healthcare Calling Support System that integrates large language models, Retrieval-Augmented Generation (RAG), and cloud telephony. The system supports Hindi, Marathi, English, and code-mixed speech over regular phone calls, so patients can interact naturally without a smartphone or a dedicated application. Its key contributions are a multilingual healthcare voice agent, a highly accurate language detection engine, real-time RAG-based information retrieval, and emotion-aware responses with emergency escalation.
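The retrieval-augmented step can be illustrated with a minimal, dependency-free sketch. It is not the system's implementation: a bag-of-words cosine ranking stands in for ChromaDB's vector search, the prompt is returned rather than sent to Llama 3.1 through the Groq API, and the knowledge-base entries and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Placeholder knowledge-base entries (invented for illustration only).
KNOWLEDGE_BASE = [
    "OPD timings are 9 AM to 5 PM from Monday to Saturday.",
    "Cardiology consultations are held on Tuesday and Thursday mornings.",
    "Laboratory reports are usually ready within 24 hours of sample collection.",
]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 1) -> list:
    """Return the k knowledge-base entries most similar to the query."""
    q = embed(query)
    ranked = sorted(
        KNOWLEDGE_BASE, key=lambda doc: cosine(q, embed(doc)), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, language: str) -> str:
    """Assemble a grounded prompt; a deployed system would send this to the LLM."""
    context = "\n".join(retrieve(query, k=2))
    return (
        "Answer using only the hospital information below. "
        "If the answer is not there, say so.\n"
        f"Reply in: {language}\n"
        f"Hospital information:\n{context}\n"
        f"Patient question: {query}"
    )


print(build_prompt("When will my laboratory reports be ready?", "English"))
```

Restricting the model to retrieved hospital text is what grounds the answer. In a deployed pipeline the retrieval function would query the vector store and the prompt would be passed to the language model.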
The literature review highlights that existing healthcare conversational agents improve patient engagement but generally lack multilingual capabilities. Previous studies also emphasize the importance of retrieval-based knowledge systems for reducing AI hallucinations and the need for handling code-mixed Indian languages. Research further shows that emotion-aware conversational systems can significantly improve user satisfaction and engagement.
The proposed system follows a six-layer architecture coordinated by a Flask web application. Its major components include:
Intent Detection Engine: Classifies patient queries into categories such as emergency, appointment booking, report checking, doctor availability, and hospital timing.
Multilingual Detection Engine: Uses a three-stage approach to identify Hindi, Marathi, English, and code-mixed speech with high accuracy.
Emotion Detection Module: Detects caller states such as emergency, anxiety, frustration, distress, and positivity, enabling adaptive responses and emergency escalation.
RAG Pipeline: Retrieves relevant hospital information from a knowledge base and passes it to a Llama 3.1 language model to generate factually grounded responses with a reduced risk of hallucination.
Database and Web Portal: Stores appointments, patient information, and call logs, and provides administrative dashboards.
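The language detection and emergency escalation components listed above can be sketched as follows. The paper does not spell out the three detection stages or publish its keyword lists, so the arrangement here (script check, marker-word voting, final decision) and every word list and function name are assumptions made for illustration, not the system's actual logic.

```python
import re

# Hypothetical marker vocabularies (Devanagari and romanized forms).
MARKERS = {
    "hindi": {"है", "मुझे", "चाहिए", "कब", "hai", "mujhe", "chahiye", "kab"},
    "marathi": {"आहे", "मला", "पाहिजे", "कधी", "ahe", "mala", "pahije", "kadhi"},
    "english": {"appointment", "doctor", "report", "need", "want", "when", "please"},
}

# Hypothetical emergency keywords in English, Hindi, and Marathi.
EMERGENCY_TERMS = {
    "emergency", "ambulance", "unconscious",
    "behosh", "बेहोश",
    "beshuddha", "बेशुद्ध",
}

# Tokens are runs of Latin letters or Devanagari code points.
TOKEN_RE = re.compile(r"[a-z\u0900-\u097f]+")


def tokenize(text: str) -> list:
    return TOKEN_RE.findall(text.lower())


def detect_language(text: str) -> str:
    """Return 'hindi', 'marathi', 'english', or 'code-mixed'."""
    tokens = tokenize(text)
    # Stage 1: script check, used as a fallback signal when no marker matches.
    has_devanagari = any("\u0900" <= ch <= "\u097f" for ch in text)
    # Stage 2: count marker words per language.
    votes = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in MARKERS.items()
    }
    present = [lang for lang, count in votes.items() if count > 0]
    # Stage 3: decide; markers from more than one language mean code-mixing.
    if len(present) > 1:
        return "code-mixed"
    if len(present) == 1:
        return present[0]
    # Assumed defaults when no marker word is found.
    return "hindi" if has_devanagari else "english"


def handle_turn(text: str) -> dict:
    """Check for an emergency first, then report the detected language."""
    is_emergency = any(tok in EMERGENCY_TERMS for tok in tokenize(text))
    return {
        "language": detect_language(text),
        "action": "escalate_108" if is_emergency else "continue",
    }


print(handle_turn("Mala udya appointment pahije"))
print(handle_turn("Patient behosh hai, ambulance bhejo"))
```

Running the emergency check before any other processing reflects the safety-first ordering the paper describes. A production system would need far larger vocabularies and would act on the escalation rather than only returning a flag.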
Experimental evaluation demonstrated strong system performance. The intent detection module achieved 96.4% accuracy with 100% emergency detection accuracy, while the multilingual language detection system achieved 100% accuracy, including for code-mixed queries. Response latency averaged 1.4 seconds, well below the 3-second real-time conversation requirement. Live telephone testing confirmed successful appointment booking, information retrieval, and emergency handling through actual phone calls.
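The arithmetic behind these figures can be checked with a short sketch. The count of 27 correct classifications is an inference, not a number stated in the paper: it is the only whole number of correct answers out of 28 that rounds to 96.4%. The constant and function names are illustrative.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage of correctly handled queries."""
    return 100.0 * correct / total


INTENT_TOTAL = 28       # test-set size reported in the paper
INTENT_CORRECT = 27     # inferred: only integer count that rounds to 96.4%
AVG_LATENCY_S = 1.4     # reported average end-to-end latency
LATENCY_BUDGET_S = 3.0  # reported real-time conversation requirement

print(f"Intent accuracy: {accuracy_percent(INTENT_CORRECT, INTENT_TOTAL):.1f}%")
print(f"Latency headroom: {LATENCY_BUDGET_S - AVG_LATENCY_S:.1f} s")
print(f"Share of budget used: {100 * AVG_LATENCY_S / LATENCY_BUDGET_S:.0f}%")
```

Under these assumptions a single misclassified query accounts for the gap between 96.4% and 100%, and the average response uses a little under half of the 3-second budget.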
Compared with traditional IVR systems and text-based chatbots, the proposed solution offers multilingual support, natural conversational interaction, real telephone call capability, emotion detection, RAG-based knowledge grounding, emergency escalation, 24/7 availability, and low-cost open-source deployment. Overall, the study demonstrates that AI-driven multilingual healthcare voice agents can significantly improve patient communication and accessibility in Indian healthcare settings while meeting real-time operational requirements.
Conclusion
This paper presented the AI-Powered Multilingual Healthcare Calling Support System, a complete, validated, open-source architecture addressing the gap in automated multilingual patient telephone communication for Indian hospitals. The system achieved 96.4% NLP intent detection accuracy, 100% multilingual detection accuracy including code-mixed speech, 100% emergency detection across all three supported languages, and 1.4-second average end-to-end response latency, and its workflows were validated through real Twilio telephone call tests under live operating conditions.
The primary technical contributions are: a validated multilingual healthcare voice agent for the Indian linguistic context; a lightweight domain-specific multilingual detection engine that reached 100% accuracy in the evaluation without GPU inference; a demonstrated RAG-telephony integration meeting real-time conversational latency requirements; and a safety-first emergency detection framework operating across Hindi, Marathi, and English. The fully open-source implementation using affordable cloud APIs makes the system economically accessible for deployment in Indian hospitals of all sizes.
Future directions include fine-tuning a multilingual healthcare LLM on curated code-mixed conversation data, direct integration with hospital management systems for real-time appointment synchronization, WhatsApp and SMS confirmation channels, language-specific neural TTS for more natural Hindi and Marathi voice output, edge deployment using quantized on-premises models for improved privacy and latency, and a multi-tenant platform supporting networks of hospitals through shared AI infrastructure.
References
[1] L. Laranjo et al., "Conversational agents in healthcare: a systematic review," J. Am. Med. Inform. Assoc., vol. 25, no. 9, pp. 1248–1258, 2018.
[2] S. Sitaram et al., "A survey of code-switched speech and language processing," arXiv:1904.00784, 2019.
[3] H. Touvron et al., "Llama 2: Open foundation and fine-tuned chat models," arXiv:2307.09288, 2023.
[4] P. Lewis et al., "Retrieval-augmented generation for knowledge-intensive NLP tasks," Adv. Neural Inf. Process. Syst., vol. 33, pp. 9459–9474, 2020.
[5] M. Porcheron et al., "Voice interfaces in everyday life," in Proc. CHI 2018, pp. 1–12, 2018.
[6] T. Jadczyk et al., "Artificial intelligence can improve patient management at the time of a pandemic," J. Med. Internet Res., vol. 23, no. 1, p. e22959, 2021.
[7] S. Khanuja et al., "MuRIL: Multilingual representations for Indian languages," arXiv:2103.10730, 2021.
[8] A. B. Kocaballi et al., "The personalization of conversational agents in health care: systematic review," J. Med. Internet Res., vol. 21, no. 11, p. e15360, 2019.
[9] S. Pandya and M. Holia, "Automating customer service using NLP," in Proc. 2019 ICICICT, pp. 220–224, 2019.
[10] A. Vaswani et al., "Attention is all you need," Adv. Neural Inf. Process. Syst., vol. 30, 2017.
[11] Z. Guo et al., "Evaluating large language models: A comprehensive survey," arXiv:2310.19736, 2023.
[12] A. Palanica et al., "Physicians' perceptions of chatbots in health care," J. Med. Internet Res., vol. 21, no. 4, p. e12887, 2019.